In-context Example Selection with Influences
https://arxiv.org/abs/2302.11042
From the abstract: "we use in-context influences to analyze few-shot ICL performance directly from the in-context examples."
Our analysis uncovers up to a 16.3% performance gap between using the most negative in-context examples compared to the most positive.
Figure 1
#SuperGLUE
https://github.com/BrachioLab/incontext_influences
blog https://debugml.github.io/incontext-influences/
Classification tasks (seems to be the focus)
Select few-shot examples from the train set and measure performance on the validation set (Algorithm 1)
By varying which train examples are selected, compute an influence score for each example
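A toy sketch of this idea (not the paper's exact implementation; `eval_fn` is a hypothetical stand-in for running the LLM on the validation set with a given few-shot prompt): sample many random subsets, and score each example by the mean validation accuracy of subsets that include it minus the mean accuracy of subsets that exclude it.

```python
import random
from collections import defaultdict

def influence_scores(train_ids, eval_fn, num_subsets=200, k=4, seed=0):
    """Estimate a per-example in-context influence score.

    eval_fn(subset) -> validation accuracy when `subset` is used as the
    few-shot prompt. The score of example i is the mean accuracy of
    sampled subsets containing i minus the mean accuracy of those
    excluding it.
    """
    rng = random.Random(seed)
    with_i = defaultdict(list)
    without_i = defaultdict(list)
    for _ in range(num_subsets):
        subset = rng.sample(train_ids, k)
        acc = eval_fn(subset)
        chosen = set(subset)
        for i in train_ids:
            (with_i if i in chosen else without_i)[i].append(acc)
    return {
        i: sum(with_i[i]) / max(len(with_i[i]), 1)
           - sum(without_i[i]) / max(len(without_i[i]), 1)
        for i in train_ids
    }
```

With this scoring, "most positive" vs. "most negative" prompts are just the top-k vs. bottom-k examples by score.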
4.3 Negative vs. Positive Examples
Reasons an example can be negative include label errors (among others)
4.5 Case study: Example Ordering
recency bias
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Calibrate Before Use: Improving Few-Shot Performance of Language Models
Figure 6
Influence magnitudes are bigger at later positions.
"In few-shot prompts, later examples have more influence on the model"
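A hypothetical way to probe this position effect (again assuming an `eval_fn` stand-in for the model; not the paper's code): for each prompt position, swap in a random replacement example and record how much accuracy moves. Under recency bias, later positions should show larger mean shifts.

```python
import random
from collections import defaultdict

def positional_influence(train_ids, eval_fn, num_prompts=100, k=4, seed=0):
    """Estimate how sensitive accuracy is to each prompt position.

    eval_fn(ordered_examples) -> validation accuracy for that ordered
    few-shot prompt. For each position p, swap in a random example not
    already in the prompt and record the absolute accuracy change;
    a larger mean change suggests a more influential position.
    """
    rng = random.Random(seed)
    deltas = defaultdict(list)
    for _ in range(num_prompts):
        prompt = rng.sample(train_ids, k)
        base = eval_fn(prompt)
        for p in range(k):
            swapped = prompt.copy()
            swapped[p] = rng.choice([i for i in train_ids if i not in prompt])
            deltas[p].append(abs(eval_fn(swapped) - base))
    return {p: sum(d) / len(d) for p, d in deltas.items()}
```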
IMO: can we really pre-select only the positive examples? (Won't that overfit to the dataset at hand?)